Improving Articulatory Feature and Phoneme Recognition Using Multitask Learning
نویسندگان
چکیده
Speech sounds can be characterized by articulatory features. Articulatory features are typically estimated using a set of multilayer perceptrons (MLPs), i.e., a separate MLP is trained for each articulatory feature. In this paper, we investigate multitask learning (MTL) approach for joint estimation of articulatory features with and without phoneme classification as subtask. Our studies show that MTL MLP can estimate articulatory features compactly and efficiently by learning the inter-feature dependencies through a common hidden layer representation. Furthermore, adding phoneme as subtask while estimating articulatory features improves both articulatory feature estimation and phoneme recognition. On TIMIT phoneme recognition task, articulatory feature posterior probabilities obtained by MTL MLP achieve a phoneme recognition accuracy of 73.2%, while the phoneme posterior probabilities achieve an accuracy of 74.0%.
منابع مشابه
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
متن کاملContinuous Episodic Memory Based Speech Recognition Using Articulatory Dynamics
In this paper we present a speech recognition system based on articulatory dynamics. We do not extend the acoustic feature with any explicit articulatory measurements but instead the articulatory dynamics of speech are structurally embodied within episodic memories. The proposed recognizer is made of different memories each specialized for a particular articulator. As all the articulators do no...
متن کاملMultiple Source Phoneme Recognition Aided by Articulatory Features
This paper presents an experiment in speech recognition whereby multiple phoneme recognisers are applied to the same utterance. When these recognisers agree on an hypothesis for the same time interval, that hypothesis is assumed to be correct. When they are in disagreement, fine-grained phonetic features, called articulatory features, recognised from the same speech utterance are used to create...
متن کاملOne-model speech recognition and synthesis based on articulatory movement HMMs
One-model speech recognition (SR) and speech synthesis (SS) based on a common articulatory movement model are described herein. The SR engine has an articulatory feature (AF) extractor and an HMM based classifier that models articulatory gestures. Experimental results of a phoneme recognition task show that the AF outperforms MFCC even if the training data are limited to a single speaker. In th...
متن کاملArticulatory-Acoustic-Feature-based Automatic Language Identification
Automatic language identification is one of the important topics in multilingual speech technology. Ideal language identification systems should be able to classify the language of speech utterances within a specific time before further processing by language-dependent speech recognition systems or monolingual listeners begins. Currently the best language identification systems are based on HMM...
متن کامل